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Directory Assistance Problem and High-Level Design Proposal 
Abstract 


Edge TRILL (Transparent Interconnection of Lots of Links) switches 
currently learn the mapping between MAC (Media Access Control) 
addresses and their egress TRILL switch by observing the data packets 
they ingress or egress or by the TRILL ESADI (End-Station Address 
Distribution Information) protocol. When an ingress TRILL switch 
receives a data frame for a destination address (MAC&Label) that the 
switch does not know, the data frame is flooded within the frame’s 
Data Label across the TRILL campus. 


This document describes the framework for using directory services to 
assist edge TRILL switches in reducing multi-destination frames, 
particularly unknown unicast frames flooding, and ARP/ND (Address 
Resolution Protocol / Neighbor Discovery), thus improving TRILL 
network scalability and security. 


Status of This Memo 


This document is not an Internet Standards Track specification; it is 
published for informational purposes. 


This document is a product of the Internet Engineering Task Force 


(IETF). It represents the consensus of the IETF community. It has 
received public review and has been approved for publication by the 
Internet Engineering Steering Group (IESG). Not all documents 


approved by the IESG are a candidate for any level of Internet 
Standard; see Section 2 of RFC 5741. 


Information about the current status of this document, any errata, 


and how to provide feedback on it may be obtained at 
http://www.rfc-editor.org/info/rfc7067. 
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1. Introduction 


Edge TRILL (Transparent Interconnection of Lots of Links) switches 
(devices implementing [RFC6325], also known as RBridges) currently 
learn the mapping between destination MAC addresses and their egress 
TRILL switch by observing data packets or by the ESADI (End-Station 
Address Distribution Information) protocol. When an ingress RBridge 
(Routing Bridge) receives a data frame for a destination address 
(MAC&Label) that RBridge does not know, the data frame is flooded 
within that Data Label across the TRILL campus. (Data Labels are 
VLANs or FGLs (Fine-Grained Labels [FGL]). 


This document describes a framework for using directory services in 
environments where such services are available, such as typical data 
centers, to assist edge TRILL switches. This assistance can reduce 
multi-destination frames, particularly ARP [RFC826], ND [RFC4861], 
and unknown unicast, thus improving TRILL network scalability. In 
addition, the information provided by a directory can be more secure 
than that learned from the data plane (see Section 7). 


Data centers, especially Internet and/or multi-tenant data centers, 
tend to have a large number of end stations with a wide variety of 
applications. Their networks differ from enterprise campus networks 
in several ways that make them attractive for the use of directory 
assistance, in particular: 


1. Data center topology is often based on racks and rows. 
Furthermore, a Server/VM (virtual machine) Management System 
orchestrates the assignment of guest operating systems to servers, 
racks, and rows; it is not done at random. So, the information 
necessary for a directory is normally available from that 
Management System. 


2. Rapid workload shifting in data centers can accelerate the 
frequency of the physical servers being reloaded with different 
applications. Sometimes, applications loaded into one physical 
server at different times can belong to different subnets. When a 
VM is moved to a new location or when a server is loaded with a 
new application with different IP/MAC addresses, it is more likely 
that the destination address of data packets sent out from those 
VMs are unknown to their attached edge RBridges. 


3. With server virtualization, there is an increasing trend to 
dynamically create or delete VMs when the demand for resources 
changes, to move VMs from overloaded servers to less loaded 
servers, or to aggregate VMs onto fewer servers when demand is 
light. This results in the more frequent occurrence of multiple 
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subnets on the same port at the same time and a higher change rate 
for VMs than for physical servers. 

Both items 2 and 3 above can lead to applications in one subnet being 
placed in different locations (racks or rows) or one rack having 
applications belonging to different subnets. 

2. Terminology 

The terms "VLAN" and "Data Label" are used interchangeably with 
"Subnet" in this document, because it is common to map one subnet to 
one VLAN or FGL. 

Bridge: Device compliant with IEEE Std 802.10-2011 [802.10]. 


Data Label: VLAN or FGL 


EoR: End-of-Row switches in a data center. Also known as 
aggregation switches. 


End Station: Guest OS running on a physical server or on a virtual 
machine. An end station in this document has at least one 
IP address, at least one MAC address, and is connected to a 


network. 
FGL: Fine-Grained Label [FGL] 
IS-IS: Intermediate System to Intermediate System routing protocol. 


TRILL uses IS-IS [IS-IS] [RFC6326]. 
RBridge: "Routing Bridge", an alternate name for a TRILL switch. 


ToR: Top-of-Rack switches in a data center. Also known as access 
switches in some data centers. 


TRILL: Transparent Interconnection of Lots of Links [RFC6325] 
TRILL Switch: A device implementing the TRILL protocol [RFC6325]. 


VM: Virtual Machine 
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3. Impact of Massive Number of End Stations 


This section discusses the impact of a massive number of end stations 
in a TRILL campus using Data Centers as an example. 


3.1. Issues of Flooding-Based Learning in Data Centers 


It is common for Data Center networks to have multiple tiers of 
switches, for example, one or two Access Switches for each server 
rack (ToR), aggregation switches for some rows (or EoR switches), and 
some core switches to interconnect the aggregation switches. Many 
aggregation switches deployed in data centers have high port density. 
It is not uncommon to see aggregation switches interconnecting 
hundreds of ToR switches. 


4------- + +------ + 
+/------ + | +/----- + | 
| Aggr11| quss |AggrN1 | + EOR switches 
+---+---+/ +------ +/ 
/ \ / \ 
/ \ / \ 
+--+ +---+ +--+ +---+ 
|T11|... [Tix] |T21| .. |T2y| ToR switches 
+--+ +---+ +---+ +---+ 
| | | | 
+-|-+ +-|-+ +-|-+ +-|-+ 
| deed |g | de d d 
T---t T---t T---t t---* Server racks 
| deed |] |I d d d 
+---+ +---+ +---+ +---+ 
|I deed |g pøs Ull 
+---+ +---+ +---+ +---+ 


Figure 1: Typical Data Center Network Design 


The following problems could occur when TRILL is deployed in a data 
center with a large number of end stations and when the end stations 
in one subnet/Label are placed under multiple edge RBridges: 


- Unnecessary filling of slots in the MAC address learning table 
of edge RBridges, e.g., RBridge Tll, due to Tll receiving 
broadcast/multicast traffic (e.g., ARP/ND, cluster multicast, 
etc.) from end stations under other edge RBridges that are not 
actually communicating with any end stations attached to T11. 


- Packets being flooded across a TRILL campus when their 


destination MAC addresses are not in the ingress RBridge’s MAC 
address to the egress RBridge cache. 
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3.2. Two Examples 


Consider a data center with 1,600 server racks. Each server rack has 
at least one ToR switch. The ToR switches are further divided into 8 
groups, with each group being connected by a set of aggregation 
switches. There could be 4 to 8 aggregation switches in each set to 
achieve load sharing for traffic to/from server racks. Let's 
consider the following two scenarios for the TRILL campus boundary if 
TRILL is deployed in this data center environment: 


- Scenario fl: TRILL campus boundary starts at the ToR switches: 


If each server rack has one ToR, there are 1,600 edge RBridges. 
If each rack has two ToR switches, then there will be 3,200 
edge RBridges. 


In this scenario, the TRILL campus will have more than 1,600 
(or 3,200) + 8*4 (or 8*8) nodes, which is a large IS-IS area. 
Even though a mesh IS-IS area can scale up to thousands of 
nodes, it is challenging for aggregation switches to handle 
IS-IS link state advertisement among hundreds of parallel 
ports. 


If each ToR has 40 downstream ports facing servers and each 
server has 10 VMs, there could be 40*10 = 400 end stations 
attached. If those end stations belong to 8 Labels, then the 
total number of MAC&Label entries learned by each edge RBridge 
in the worst case might be 400*8 = 3,200, which is not a large 
number. 


- Scenario #2: TRILL campus boundary starts at the aggregation 
switches: 


With the same assumptions as before, the number of nodes in the 
TRILL campus will be less than 100, and aggregation switches 
don’t have to handle IS-IS link state advertisements among 
hundreds of parallel ports. 


However, the number of MAC&Label <-> Egress RBridge mapping 
entries to be learned and managed by the RBridge edge node can 
be very large. In the example above, each edge RBridge has 200 
edge ports facing the ToR switches. If each ToR has 40 
downstream ports facing servers and each server has 10 VMs, 
there could be 200*40*10 = 80,000 end stations attached. If 
all those end stations belong to 1,600 Labels (50 per Data 
Label) and each Data Label has 200 end stations, then under the 
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worst-case scenario, the total number of MAC&Label entries to 
be learned by each edge RBridge can be 1,600*200=320,000, which 
is very large. 


4. Benefits of Directory-Assisted TRILL Edge 


In some environments, particularly data centers, the assignment of 
applications to servers, including rack and row selection, is 
orchestrated by Server (or VM) Management System(s). That is, there 
is a database or multiple databases that have the knowledge of where 
each application is placed. If the application location information 
can be fed to RBridge edge nodes through some form of directory 
service, then there is much less chance of RBridge edge nodes 
receiving unknown MAC destination addresses, therefore less chance of 
flooding. 


Avoiding unknown unicast address flooding to the TRILL campus is 
especially valuable in the data center environment, because there is 
a higher chance of an edge RBridge receiving packets with an unknown 
unicast destination address and broadcast/multicast messages due to 
VM migration and servers being loaded with different applications. 
When a VM is moved to a new location or a server is loaded with a new 
application with a different IP/MAC addresses, it is more likely that 
the destination address of data packets sent out from those VMs is 
unknown to their attached edge RBridges. In addition, gratuitous ARP 
(IPv4 [RFC826]) or Unsolicited Neighbor Advertisement (IPv6 


[RFC4861]) sent out from those newly migrated or activated VMs have 
to be flooded to other edge RBridges that have VMs in the same 
subnets. 


The benefits of using directory assistance include: 


- Avoids flooding an unknown unicast destination address across 
the TRILL campus. The directory-enforced MAC&Label <-> Egress 
RBridge mapping table can determine if a data packet needs to 
be forwarded across the TRILL campus. 


When multiple RBridge edge ports are connected to end stations 
(servers/VMs), possibly via bridged LANs, a directory-assisted 
edge RBridge won’t need to flood unknown unicast destination 
data frames to all ports of the edge RBridges in the frame’s 
Data Label when it ingresses a frame. It can depend on the 
directory to locate the destination. When the directory 
doesn’t have the needed information, the frames can be dropped 
or flooded depending on the policy configured. 
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- Reduces flooding of decapsulated Ethernet frames with an 
unknown MAC destination address to a bridged LAN connected to 
RBridge edge ports. 


When an RBridge receives a unicast TRILL data packet whose 
destination Nickname matches with its own, the normal procedure 
is for the RBridge to decapsulate it and forward the 
decapsulated Ethernet frame to the directly attached bridged 
LAN. If the destination MAC is unknown, the RBridge floods the 
decapsulated Ethernet frame out all ports in the frame's Data 
Label. With directory assistance, the egress RBridge can 
determine if the MAC destination address in a frame matches any 
end stations attached via the bridged LAN. Frames can be 
discarded if their destination addresses do not match. 


- Reduces the amount of MAC&Label «-» Egress RBridge mapping 
maintained by edge RBridges. There is no need for an edge 
RBridge to keep MAC entries of remote end stations that don't 
communicate with the end stations locally attached. 


- Eliminates ARP/ND being broadcast or multicast through the 
TRILL core. 


- Provides some protection against spoofing of source addresses 
(see Section 7). 


Generic Operation of Directory Assistance 


There are two different models for directory assistance to edge 
RBridges: Push Model and Pull Model. The directory information is 
described in Section 5.1 below, while Section 5.2 discusses Push 
Model requirements, and Section 5.3 Pull Model requirements. 


1. Information in Directory for Edge RBridges 
To achieve the benefits of directory assistance for TRILL, the 


corresponding Directory Server entries will need, at a minimum, the 
following logical data structure: 


[IP, MAC, Data Label, (list of attached RBridge nicknames), (list of 
interested RBridges]] 


The (list of attached RBridges) are the edge RBridges to which the 
host (or VM) is attached as specified by the [IP, MAC, Data Label] in 
the entry. The (list of interested RBridges) are the remote RBridges 
that might have attached hosts that communicate with the host in this 
entry. 
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When a host has multiple IP addresses, there will be multiple 
entries. 


The {list of interested RBridges} could get populated when an RBridge 
queries for information, or information is pushed from a Directory 
Server. The list is used to notify those RBridges when the host 
(specified by the [IP, MAC, Data Label]) in the entry changes its 
RBridge attachment. An explicit list in the directory is not needed 
as long as the interested RBridges can be determined. 


5.2. Push Model and Requirements 


Under this model, Directory Server(s) push the MAC&Label <-> Egress 
RBridge mapping for all the end stations that might communicate with 
end stations attached to an RBridge edge node. If the packet’s 
destination address can’t be found in the MAC&Label <-> Egress 
RBridge table, the Ingress RBridge could be configured to: 


simply drop a data packet, 
flood it to the TRILL campus, or 


start the pull process to get information from the Pull Directory 
Server(s). 


It may not be necessary for every edge RBridge to get the entire 
mapping table for all the end stations in a campus. There are many 
ways to narrow the full set down to a smaller set of remote end 
stations that communicate with end stations attached to an edge 
RBridge. A simple approach is to only push the mapping for the Data 
Labels that have active end stations under an edge RBridge. This 
approach can reduce the number of mapping entries being pushed. 


However, the Push Model will usually push more entries of MAC&Label 
<-> Egress RBridge mapping to an edge RBridges than needed. Under 
the normal process of edge RBridge cache aging and unknown 
destination address flooding, rarely used mapping entries would have 
been removed. But it can be difficult for Directory Servers to 
predict the communication patterns among applications within one Data 
Label. Therefore, it is likely that the Directory Servers will push 
down all the MAC&Label entries if there are end stations in the Data 
Label attached to the edge RBridge. This is a disadvantage of the 
Push Model compared with the Pull Model described below. 


In the Push Model, it is necessary to have a way for an RBridge node 
to request Directory Server(s) to push the mapping entries. This 
method should at least include the Data Labels enabled on the 
RBridge, so that the Directory Server doesn’t need to push down the 
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entire set of mapping entries for all the end stations in the campus. 
An RBridge must be able to get mapping entries when it is initialized 
or restarted. 


The Push Model's detailed method and any handshake mechanism between 
an RBridge and Directory Server(s) is beyond the scope of this 
framework document. 


When a Directory Server needs to push a large number of entries to 
edge RBridges, efficient data organization should be considered, for 
example, with one edge RBridge nickname being associated with all the 
attached end stations” MAC addresses and Data Labels. As shown in 
Table 1 below, to make the data more compact, a representation can be 
used where a nickname need only occur once for a set of Labels, each 
of which occurs only once and each of which is associated with a set 
of multiple IP and MAC address pairs. It would be much more bulky to 
have each IP and MAC address pair separately accompanied by its Label 
and by the nickname of the RBridge by which it is reachable. 


+------------ +—======== 4-------------------------------- 4 
| Nicknamel |Label-1 | IP/MAC1, IP/MAC2, ,, IP/MACn | 

-------- 4--------------------------------X 
| Label-2 | IP/MAC1, IP/MAC2, ,, IP/MACn | 
| | -------- 4----22222222222222222222222-2-2--- + 
| |}, iaria | IP/MAC1, IP/MAC2, ,, IP/MACn | 
4------------ +-—======= 4-------------------------------- + 
| Nickname2 |Label-1 | IP/MAC1, IP/MAC2, ,, IP/MACn | 

-------- 4--------------------------------X 
| Label-2 | IP/MAC1, IP/MAC2, ,,IP/MACn | 
| |-------- R E + 
| | | IP/MAC1, IP/MAC2, ,, IP/MACn | 
+------------ +-------- +—=============================== + 
| ------- | -------- $iseseeaSekae Sos ssse seen ses + 
| | | IP/MAC1, IP/MAC2, ,, IP/MACn | 
+------------ +-------- 4-------------------------------- t 


Table 1: Summarized Table Pushed Down from Directory 


Whenever there is any change in MAC&Label «-» Egress RBridge mapping 
that can be triggered by end stations being added, moved, or 
decommissioned, an incremental update can be sent to the edge 
RBridges that are impacted by the change. Therefore, something like 
a sequence number has to be maintained by Directory Servers and 
RBridges. Detailed mechanisms will be specified in a separate 
document. 
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5.3. Pull Model and Requirements 


Under this model, an RBridge pulls the MAC&Label <-> Egress RBridge 
mapping entry from the Directory Server when its cache doesn’t have 
the entry. There are a couple of possibilities for triggering the 
pulling process: 


- The RBridge edge node can send a pull request whenever it 
receives an unknown MAC destination, or 


- The RBridge edge node can intercept all ARP/ND requests and 
forward them or appropriate requests to the Directory Server(s) 
that has the information on where the target end stations are 
located. 


The Pull Directory response could indicate that the address being 
queried is unknown or that the requestor is administratively 
prohibited from getting an informative response. 


By using a Pull Directory, a frame with an unknown MAC destination 
address doesn't have to be flooded across the TRILL campus and the 
ARP/ND requests don't have to be broadcast or multicast across the 
TRILL campus. 


The ingress RBridge can cache the response pulled from the directory. 
The timer for such a cache should be short in an environment where 
VMs move frequently. The cache timer could be configured by the 
Management System or sent along with the Pulled reply by the 
Directory Server(s). It is important that the cached information be 
kept consistent with the actual placement of addresses in the campus; 
therefore, there needs to be some mechanism by which RBridges that 
have pulled information that has not expired can be informed when 
that information changes or becomes invalid for other reasons. 


One advantage of the Pull Model is that edge RBridges can age out 
MAC&Label entries if they haven't been used for a certain configured 
period of time or a period of time provided by the directory. 
Therefore, each edge RBridge will only keep the entries that are 
frequently used, so its mapping table size will be smaller. Edge 
RBridges would query the Directory Server(s) for unknown MAC 
destination addresses in data frames or ARP/ND and cache the 
response. When end stations attached to remote edge RBridges rarely 
communicate with the locally attached end stations, the corresponding 
MAC&VLAN entries would be aged out from the RBridge's cache. 


An RBridge waiting for a response from Directory Servers upon 
receiving a data frame with an unknown destination address is similar 
to an Layer-3/Layer-2 boundary router waiting for an ARP or ND 
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response upon receiving an IP data packet whose destination IP is not 
in the router’s IP/MAC cache table. Most deployed routers today do 
hold the packet and send ARP/ND requests to the target upon receiving 
a packet with a destination IP not in its IP-to-MAC cache. When 
ARP/ND replies are received, the router will send the data packet to 
the target. This practice minimizes flooding when targets don’t 
exist in the subnet. 


When the target doesn’t exist in the subnet, routers generally resend 
an ARP/ND request a few more times before dropping the packets. So, 
if the target doesn’t exist in the subnet, the router’s holding time 
to wait for an ARP/ND response can be longer than the time taken by 
the Pull Model to get IP-to-MAC mapping from a Directory Server. 


RBridges with mapping entries being pushed from a Directory Server 
can be configured to use the Pull Model for targets that don’t exist 
in the mapping data being pushed. 


A separate document will specify the detailed messages and mechanism 
for RBridges to pull information from Directory Server(s). 


6. Recommendation 


TRILL should provide a directory-assisted approach. This document 
describes a basic framework for directory assistance to RBridge edge 
nodes. More detailed mechanisms will be described in a separate 
document or documents. 


7. Security Considerations 


For general TRILL security considerations, see Section 6 of 
[RFC6325]. 


Accurate mapping of IP addresses into MAC addresses and of MAC 
addresses to the RBridges from which they are reachable is important 
to the correct delivery of information. The security of specific 
directory-assisted mechanisms for delivering such information will be 
discussed in the document or documents specifying those mechanisms. 


A directory-assisted TRILL edge can be used to substantially improve 
the security of a TRILL campus over TRILL’s default MAC address 
learning from the data plane. Assume S is an end station attached to 
RB1 trying to spoof a target end station T and that T is attached to 
RB2. Perhaps S wants to steal traffic intended for T or forge 
traffic as if it was from T. 
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With that default TRILL data-plane learning as described in 
[RFC6325], S can impersonate T or any other end station in the same 
Data Label (VLAN or FGL [FGL]) as S and possibly other Data Labels, 
depending on how tightly VLAN admission and Appointed Forwarders 
[RFC6439] are configured at the port by which S is connected to RB1. 
S can just send native frames with the forged source MAC addresses of 
T, perhaps broadcast frames for maximum effectiveness. With this 
technique, S will frequently receive traffic intended for T and S can 
easily forge traffic as being from T. 


Such spoofing can be prevented to the extent that the network 
RBridges (1) use trusted directory services as described above in 
this document, (2) discard native frames received from a local end 
station when the directory says that end stations should be remote, 
and, (3) when appropriate, intercept ARP and ND messages and respond 
locally. Under these circumstances, S would be limited to spoofing 
targets on the same RBridge as the ingress RBridge for S (that is, 
RB1 = RB2). RBI would still need to learn which local end stations 
were attached to which port, and S could confuse RB1 by sending 
frames with the forged source MAC address of other end stations on 
RB1. Although it would also still be restricted to frames in a VLAN 
that would both be admitted by S's port of attachment and for which 
that port is an Appointed Forwarder. 


Security against spoofing could be even further strengthened by 
adding port of attachment information to the directory and discarding 
native frames that are received on the wrong port. This would limit 
S to spoofing targets that were on the same link as S and in a VLAN 
admitted by the port of that link's attachment to RB1 and for which 
that port is an Appointed Forwarder (or, if the link is multiply 
connected, in the same way at all of the ports by which the link is 
attached to an RBridge). 


Even without directory services, secure ND [RFC3971] or use of secure 
ESADI (as described in [ESADI]) may also be helpful to security. 
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